perm filename TESTEX.WEB[WEB,ALS] blob sn#659971 filedate 1982-05-13 generic text, type C, neo UTF8
% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 0 is presently being implemented.

% Here is TEX material that gets inserted after \input webhdr
\magnify{1200}\setpage
\def\hang{\hangindent 3em\ \unskip\!}
\def\textindent#1{\hangindent 2.5em\noindent\hbox to 2.5em{\hss#1 }\!}
\chcode@@=13 \def@@{\penalty999\ } % ties words together
\def\TEX{T\hbox{\hskip-.1667em\lower.424ex\hbox{E}\hskip-.125em X}}
\font b=cmr9 \def\mc{\:b} % medium caps for names like PASCAL
\def\PASCAL{{\mc PASCAL}}
\def\ph{{\mc PASCAL-H}}
\font L=manfnt % font used for the METAFONT logo
\def\MF{{\:L META}\-{\:L FONT}}

\def\glob{17} % change this to the section number of "<Globals...>"
\def\gglob{27, 29} % change this to the next two sections of "<Globals...>"
\def\(#1){} % this is used to make module names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\font D=cmtt at 15truept % font used in the title line below (only)
\font E=cmr7 at 14truept % font used in the title line below (only)

\def\title{\TEX}
\def\contentspagenumber{1}
\def\topofcontents{\topspace 0pt
	\vfill
	\ctrline{\:E The {\:D \TEX} processor}
	\vskip 15pt
	\ctrline{(excerpts only---for change-file generation)}
	\vfill}
\def\botofcontents{\vfill
	\ctrline{\ragged0\spaceskip0pt\xspaceskip0pt\baselineskip9pt
		\hbox par 5in{\:bThis is a very abbreviated portion
		of a very preliminary version of the real program that
		has been extracted so that a very first attempt may be
		made to write \TEX82 change files for computers other
		than the {\mc DEC}20.}}}
\setcount0 \contentspagenumber
\topofcontents
\ctrline{(replace this page by the contents page printed later)}
\botofcontents
\mark{1}\eject
@* Introduction.
This is \TEX, a document compiler intended for high-quality typesetting.
The \PASCAL\ program that follows is the definition of \TEX82, a standard
@:PASCAL}{\PASCAL@>
@:TEX82}{\TEX82@>
version of \TEX\ that is designed to be highly portable so that identical output
will be obtainable on a great variety of different computers.

The main purpose of the following program is to explain the algorithms of \TEX\
as clearly as possible. As a result, the program will not necessarily be very
efficient when a particular \PASCAL\ compiler has translated it into a
particular machine language. However, the program has been written so that it
can be tuned to run efficiently in a wide variety of operating environments
by making comparatively few changes. Such flexibility is possible because
the documentation that follows is written in the \.{WEB} language, which is
at a higher level than \PASCAL; the preprocessing step that converts \.{WEB}
to \PASCAL\ is able to introduce most of the necessary refinements.
Semi-automatic translation to other languages is also feasible, because the
program below does not make extensive use of features that are peculiar to
\PASCAL.

@ Different \PASCAL s have slightly different conventions, and the present
@:PASCAL H}{\ph@>
program expresses \TEX\ in terms of the \PASCAL\ that was
available to the author in 1981. The methods used here to work with
this particular compiler, which we shall call \ph, should help the
reader to see how to make an appropriate interface for other systems
if necessary. (\ph\ is Charles Hedrick's modification of a compiler for the
@↑Hedrick, Charles Locke@>
DECsystem-10 that was originally developed at the University of Hamburg; cf.\
{\sl SOFTWARE---Practice \AM\ Experience \bf6} (1976), 29--42. The \TEX\
program below is intended to be adaptable, without extensive changes,
to most other versions of \PASCAL,
so it does not fully use the admirable features of \ph.)

The portions of this program that involve system-dependent code, where
changes might be necessary because of differences between \PASCAL\ compilers
and/or differences between
operating systems, can be identified by looking at the sections whose
numbers are listed under `system dependencies' in the index. Furthermore,
the index entries for `dirty \PASCAL' list all places where the restrictions
of \PASCAL\ have not been followed perfectly, for one reason or another.
@!@↑system dependencies@>
@!@↑dirty \PASCAL@>

@ The program begins with a normal \PASCAL\ program heading, whose
components will be filled in later, using the conventions of \.{WEB}.
@.WEB@>
For example, the portion of the program called `\X\glob:Globals in the outer
block\X' here will be replaced by a sequence of variable declarations
that starts in $\section\glob$ of this documentation. In this way, we are able
to define each individual global variable
when we are prepared to understand what it means; we do not have to define
all of the globals at once.
Cross references in $\section\glob$, where it says ``See also sections
\gglob, $\ldots$,'' also make it
possible to look at the set of all global variables, if desired.
Similar remarks apply to the other portions of the program heading.

Actually the heading shown here is not quite normal: The |program| line
does not mention any |output| file, because \ph\ would ask the \TEX\ user
to specify a file name if |output| were specified here.
@↑system dependencies@>

@d mtype==t@&y@&p@&e {this is a \.{WEB} coding trick:}
@f mtype==type {`\&{mtype}' will be equivalent to `\&{type}'}
@f type==true {but `|type|' will not be treated as a reserved word}

@p @t\4@>@<Compiler directives@>@/
program TEX; {all file names are defined dynamically}
label @<Labels in the outer block@>@/
const @<Constants in the outer block@>@/
mtype @<Types in the outer block@>@/
var@?@<Globals in the outer block@>@/
@#
procedure initialize; {this procedure gets things started properly}
	var@?@<Local variables for initialization@>@/
	begin @<Initialize whatever \TEX\ might access@>@;
	end;@#
@<Basic printing procedures@>@/
@<Error handling procedures@>@/

@ The overall \TEX\ program begins with the heading just shown, after which
comes a bunch of procedure declarations and function declarations.
Finally we will get to the main
program, which begins at label `|start_here|'. If you want to skip down to the
main program now, you can look up `|start_here|' in the index.
But the author suggests that the best way to understand this program
is to follow pretty much the order of \TEX's components as they appear in the
\.{WEB} description you are now reading, since the present ordering is
intended to combine the advantages of
the ``bottom up'' and ``top down'' approaches to the problem
of understanding a somewhat complicated system.

@d start_here=1 {this label marks the beginning of the program}
@d start_of_TEX=2 {go here when \TEX's memory is initialized}
@d end_of_TEX=9998 {go here to close files and terminate gracefully}
@d final_end=9999 {this label marks the ending of the program}

@<Labels in the out...@>=
start_here, start_of_TEX, end_of_TEX, final_end; {key points}

@ Some of the code below is intended to be used only when diagnosing the
strange behavior that sometimes occurs when \TEX\ is being installed or
when system wizards are fooling around with \TEX\ without quite knowing
what they are doing. Such code will not normally be compiled; it is
delimited by the codewords `$|debug|\ldotsm|gubed|$', with apologies
to people who wish to preserve the purity of English. Similarly, there
is some conditional code delimited by `$|stat|\ldotsm|tats|$'
that is intended only for use when statistics
are to be kept about \TEX's memory usage.
@↑debugging@>

@d debug==@{ {change this to `$\\{debug}\eqv\null$' when debugging}
@d gubed==@} {change this to `$\\{gubed}\eqv\null$' when debugging}
@f debug==begin
@f gubed==end
@#
@d stat==@{ {change this to `$\\{stat}\eqv\null$' when gathering
	usage statistics}
@d tats==@} {change this to `$\\{tats}\eqv\null$' when gathering
	usage statistics}
@f stat==begin
@f tats==end

@ This program has two important variations: (1) There is a long and slow
version called \.{INITEX}, which does the extra calculations needed to
@.INITEX@>
initialize \TEX's internal tables; and (2)@@there is a shorter and faster
production version, which cuts the initialization to a bare minimum.
Parts of the program that are needed in (1) but not in (2) are delimited by
the codewords `$|init|\ldotsm|tini|$'.

@d init== {change this to `$\\{init}\eqv\.{@@\{}$' in the production version}
@d tini== {change this to `$\\{tini}\eqv\.{@@\}}$' in the production version}
@f init==begin
@f tini==end

@<Initialize whatever...@>=
@<Set initial values of key variables@>@/
init @<Initialize table entries (done by \.{INITEX} only)@>@;@+tini

@ If the first character of a \PASCAL\ comment is a dollar sign,
@↑system dependencies@>
\ph\ treats the comment as a list of ``compiler directives'' that will
affect the translation of this program into machine language.  The
directives shown below specify full checking and inclusion of the \PASCAL\
debugger when \TEX\ is being debugged, but they cause range checking and other
redundant code to be eliminated when the production system is being generated.
Arithmetic overflow will be detected in all cases.
@↑Overflow in arithmetic@>

@<Compiler directives@>=
@{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead}
debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging}

@ This \TEX\ implementation conforms to the rules of the {\sl PASCAL User
@:PASCAL}{\PASCAL@>
@↑system dependencies@>
Manual} published by Jensen and Wirth in 1975, except where system-dependent
@↑Wirth, Niklaus@>
@↑Jensen, Kathleen@>
code is necessary to make a useful system program, and except in another
respect where such conformity would unnecessarily obscure the meaning
and clutter up the code: We assume that |case| statements may include a
default case that applies if no matching label is found. Thus, we shall use
constructions like
$$\vbox{\halign{\!#\hfil\cr
|case x of|\cr
1: $\langle\,$code for $x=1\,\rangle$;\cr
3: $\langle\,$code for $x=3\,\rangle$;\cr
|othercases| $\langle\,$code for |x≠1| and |x≠3|$\,\rangle$\cr
|endcases|\cr}}$$
since most \PASCAL\ compilers have plugged this hole in the language by
incorporating some sort of default mechanism. For example, the \ph\ 
compiler allows `|others|:' as a default label, and
other \PASCAL s allow syntaxes like `|else|' or `|otherwise|' or
`|otherwise|:', etc. The definitions of |othercases| and |endcases|
should be changed to agree with local conventions.
Note that no semicolon appears before |endcases| in this program,
so the definition of |endcases| should include a semicolon if the
compiler wants one. (Of course, if no
default mechanism is available, the |case| statements of \TEX\ will have
to be laboriously extended by listing all remaining cases. People who are
stuck with such \PASCAL s have in fact done this, successfully but not happily!)
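
A minimal sketch of one such local convention follows. It assumes a compiler
like Free Pascal, in which the default branch is written `\.{else}' with no
colon; that spelling and the function |classify| are assumptions made for
illustration and are not part of \TEX.

```pascal
program CaseDefault;
{ Sketch only: assumes a compiler whose default label in a case is "else". }

function classify(x: integer): integer;
begin
  case x of
    1: classify := 10;   { code for x=1 }
    3: classify := 30;   { code for x=3 }
  else
    classify := 0        { code for x<>1 and x<>3 }
  end
end;

begin
  writeln(classify(1), ' ', classify(2), ' ', classify(3))
end.
```

On a compiler of this kind the definition of |othercases| would presumably
become `\.{else}', and |endcases| could remain `\.{end}'.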

@d othercases == others: {default for cases not listed explicitly}
@d endcases == @+end {follows the default case in an extended |case| statement}
@f othercases == else
@f endcases == end

@ The following parameters can be changed at compile time to extend or
reduce \TEX's capacity. They may have different values in \.{INITEX} and
in production versions of \TEX.
@.INITEX@>
@↑system dependencies@>

@<Constants...@>=
@!mem_max=30000; {greatest index in \TEX's internal |mem| array,
	must be strictly less than |max_halfword|}
@!buf_size=500; {maximum number of characters simultaneously present in
	current lines of open files}
@!error_line=64; {width of context lines on terminal error messages}
@!half_error_line=32; {width of first lines of contexts in terminal
	error messages, should be between 30 and |error_line-15|}
@!max_print_line=72; {width of longest text lines output, should be at least 60}
@!stack_size=80; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
	can be going on simultaneously}
@!font_max=65; {maximum number of distinct fonts per job, must not
	exceed |max_quarterword|}
@!max_font_code=300; {largest legal font code allowed in user programs}
@!font_mem_size=15000; {number of words of |font_info| for all fonts}
@!par_size=30; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=3000; {maximum number of strings}
@!pool_size=15000; {maximum number of characters in strings, including all
	error messages and help texts, and the names of all fonts and
	control sequences; must be a few thousand more than |string_vacancies|}
@!align_size=4; {maximum number of simultaneous alignments}
@!save_size=300; {space for saving values outside of current group}
@!trie_size=7000; {space for hyphenation patterns, should be larger for
	\.{INITEX} than it is in production versions of \TEX}

@↑system dependencies@>
@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TEX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TEX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.
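
Several of the restrictions stated in the comments on these constants can be
checked mechanically. The following stand-alone program is only a sketch of
that idea: the variable |bad|, the function |is_prime|, and the choice of
which restrictions to test are assumptions made for illustration, and the test
on |pool_size| is weaker than the ``few thousand more'' that the comment
asks for.

```pascal
program CheckConstants;
{ Sketch only: re-checks a few constraints stated in the comments on the constants. }
const
  mem_base = 0;  hi_mem_base = 12000;  mem_max = 30000;
  error_line = 64;  half_error_line = 32;  max_print_line = 72;
  pool_size = 15000;  string_vacancies = 8000;
  hash_size = 2100;  hash_prime = 1777;  hyph_size = 307;
var
  bad: longint;  { number of the last constraint found violated, 0 if none }

function is_prime(n: longint): boolean;
var d: longint;
begin
  is_prime := n >= 2;
  d := 2;
  while d * d <= n do   { trial division up to the square root of n }
  begin
    if n mod d = 0 then is_prime := false;
    d := d + 1
  end
end;

begin
  bad := 0;
  if (half_error_line < 30) or (half_error_line > error_line - 15) then bad := 1;
  if max_print_line < 60 then bad := 2;
  if (hi_mem_base <= mem_base) or (hi_mem_base >= mem_max) then bad := 3;
  if pool_size <= string_vacancies then bad := 4;
  if (hash_prime >= hash_size) or not is_prime(hash_prime) then bad := 5;
  if not is_prime(hyph_size) then bad := 6;
  if bad > 0 then writeln('inconsistent constants, check number ', bad)
  else writeln('constants look consistent')
end.
```

With the values shown in this program, none of the six checks fails.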

@d mem_base=0 {smallest index in the |mem| array, must not be less
	than |min_halfword|}
@d hi_mem_base=12000 {smallest index in the single-word area of |mem|,
	must be larger than |mem_base| and smaller than |mem_max|}
@d font_base=0 {smallest internal font number, must not be less
	than |min_quarterword|}
@d hash_size=2100 {maximum number of control sequences; it should be at most
	about |(mem_max-hi_mem_base)/6|, but 2100 is already quite generous}
@d hash_prime=1777 {a prime number equal to about 85\%\ of |hash_size|}
@d hyph_size=307 {another prime; the number of \.{\\hyphenation} exceptions}
@d string_vacancies=8000 {the minimum number of characters that should be
	available for the user's control sequences and font names,
	after \TEX's own error messages are stored}

@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
character sets were common, so it did not make provision for lower case
letters. Nowadays, of course, we need to deal with both upper and lower case
alphabets in a convenient way, especially in a program for typesetting;
so the present specification of \TEX\ has been written under the assumption
that the \PASCAL\ compiler and run-time system permit the use of text files
with more than 64 distinguishable characters. More precisely, we assume that
the character set contains at least the letters and symbols associated
with ascii codes @'40 through @'176; all of these characters are now
available on most computer terminals.

Since we are dealing with more characters than were present in the first
\PASCAL\ compilers, we have to decide what to call the associated data
type. Some \PASCAL s use the original name |char| for the
characters in text files, even though there now are more than 64 such
characters, while other \PASCAL s consider |char| to be a 64-element
subrange of a larger data type that has some other name.

In order to accommodate this difference, we shall use the name |text_char|
to stand for the data type of the characters that are converted to and
from |ascii_code| when they are input and output. We shall also assume
that |text_char| consists of the elements |chr(first_text_char)| through
|chr(last_text_char)|, inclusive. The following definitions should be
adjusted if necessary.
@↑system dependencies@>

@d text_char == char {the data type of characters in text files}
@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
@d last_text_char=127 {ordinal number of the largest element of |text_char|}

@<Local variables for init...@>=
i:0..last_text_char;

@ The \TEX\ processor converts between ascii code and
the user's external character set by means of arrays |xord| and |xchr|
that are analogous to \PASCAL's |ord| and |chr| functions.
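
A minimal sketch of how two such tables can be kept inverse to each other is
given below. It assumes that the external character set agrees with ascii for
the printing characters, that |ascii_code| is the subrange 0 to 127, and that
a code called |invalid_code| marks characters with no ascii equivalent; these
three assumptions are made for illustration only.

```pascal
program CharTables;
{ Sketch only: assumes the external character set agrees with ASCII. }
const
  first_text_char = 0;
  last_text_char = 127;
  invalid_code = 127;  { hypothetical marker for characters with no equivalent }
type
  ascii_code = 0..127; { assumed range of the internal codes }
var
  xord: array [char] of ascii_code;
  xchr: array [ascii_code] of char;
  i: integer;
begin
  for i := 0 to 31 do xchr[i] := ' ';      { blank out the control codes }
  for i := 32 to 126 do xchr[i] := chr(i); { printing characters map to themselves }
  xchr[127] := ' ';
  for i := first_text_char to last_text_char do xord[chr(i)] := invalid_code;
  for i := 32 to 126 do xord[xchr[i]] := i; { make xord the inverse of xchr }
  writeln(xord['A'], ' ', xchr[97])
end.
```

Only codes 32 through 126 are inverted, so the blanks stored for the control
codes do not disturb the entry for the genuine blank space.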

@<Globals...@>=
@!xord: array [text_char] of ascii_code;
	{specifies conversion of input characters}
@!xchr: array [ascii_code] of text_char;
	{specifies conversion of output characters}


@ The ascii code is ``standard'' only to a certain extent, since many
computer installations have found it advantageous to have ready access
to more than 94 printing characters. Appendix@@C of the \TEX\ manual
gives a complete specification of the intended correspondence between
characters and \TEX's internal representation.

If \TEX\ is being used
on a garden-variety \PASCAL\ for which only standard ascii
codes will appear in the input and output files, it doesn't really matter
what codes are specified in |xchr[1..@'37]|, but the safest policy is to
blank everything out by using the code shown below.

However, other settings of |xchr| will make \TEX\ more friendly on
computers that have an extended character set, so that users can type things
like \.\NE\ instead of \.{\\ne}. At MIT, for example, it would be more
appropriate to substitute the code
$$\hbox{|for i←1 to @'37 do xchr[i]←chr(i);|}$$
\TEX's character set is essentially the same as MIT's, even with respect to
characters less than@@@'40. People with extended character sets can
assign codes arbitrarily, giving an |xchr| equivalent to whatever
characters the users of \TEX\ are allowed to have in their input files,
provided that unsuitable characters do not correspond to the special
codes like |carriage_return| that are listed above. It is best
to make the codes correspond to the intended interpretations as shown
in Appendix@@C whenever possible, because of the way \TEX\ will interpret
characters when no \.{\\chcode} and \.{\\mathcode}
commands have changed the default interpretation; but this is not
necessary. For example, in countries with an alphabet of more than 26
letters, it is usually best to map the additional letters into codes less
than@@@'40.
@↑character set dependencies@>
@↑system dependencies@>

@<Set init...@>=
for i←1 to @'37 do xchr[i]←' ';
@* Input and output.
The bane of portability is the fact that different operating systems treat
input and output quite differently, perhaps because computer scientists
have not given sufficient attention to this problem. People have felt somehow
that input and output are not a part of ``real'' programming. Well, it is true
that some kinds of programming are more fun than others. With existing
input/output conventions being so diverse and so messy, the only sources of
joy in such parts of the code are the rare occasions when one can find a
way to make the program a little less bad than it might have been. We have
two choices: whether to attack I/O now and get it over with, or to postpone
it until near the end. Neither prospect is very attractive, so let's
get it over with.

The basic operations we need to do are (1)@@inputting and outputting of
text, to or from a file or the user's terminal; (2)@@inputting and
outputting of eight-bit bytes, to or from a file; (3)@@instructing the
operating system to initiate (``open'') or to terminate (``close'') input or
output from a specified file; (4)@@testing whether the end of an input
file has been reached.

Note that \TEX\ needs to deal with only two kinds of files.
We shall use the term |alpha_file| for a file that contains textual data,
and the term |byte_file| for a file that contains eight-bit binary information.
These two types turn out to be the same on many computers, but
sometimes there is a significant distinction, so we shall be careful to
distinguish between them. Standard protocols for transferring
such files from computer to computer, via high-speed networks, are
now in common use.

@<Types...@>=
@!eight_bits=0..255; {unsigned one-byte quantity}
@!alpha_file=packed file of text_char; {files that contain textual data}
@!byte_file=packed file of eight_bits; {files that contain binary data}

@ Most of what we need to do with respect to input and output can be handled
by the I/O facilities that are standard in \PASCAL, i.e., the routines
called |get|, |put|, |eof|, and so on. But
standard \PASCAL\ does not allow file variables to be associated with file
names that are determined at run time, so it cannot be used to implement
\TEX; some sort of extension to \PASCAL's ordinary |reset| and |rewrite|
is crucial for our purposes. We shall assume that |cur_name| is a variable
of an appropriate type such that the \PASCAL\ run-time system being used to
implement \TEX\ can open a file whose external name is specified by
|cur_name|.
@↑system dependencies@>
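
For comparison, here is a minimal sketch of how the same effect might be
obtained on a different compiler. It assumes Free Pascal, whose |assign|
procedure associates a run-time name with a file variable and whose
|IOResult| function is zero after a successful |reset|; the function name
|try_open_in| and the file name used in the example are hypothetical.

```pascal
program OpenByName;
{ Sketch only: assumes Free Pascal's assign and IOResult. }
var
  cur_name: string;
  f: text;

function try_open_in(var f: text): boolean;
begin
  assign(f, cur_name);       { bind the run-time name to the file variable }
  {$I-} reset(f); {$I+}      { suppress the run-time error if reset fails }
  try_open_in := IOResult = 0
end;

begin
  cur_name := 'no-such-file.tex';   { hypothetical name, presumed absent }
  if try_open_in(f) then
    begin writeln('opened'); close(f) end
  else writeln('could not open ', cur_name)
end.
```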

The \ph\ compiler with which the present version of \TEX\ was prepared has
extended the rules of \PASCAL\ in a very convenient way. To open file
|f|, we can write
$$\vbox{\halign{#\hfil\qquad⊗#\hfil\cr
|reset(f,name,'/O')|⊗for input;\cr
|rewrite(f,name,'/O')|⊗for output.\cr}}$$
The `|name|' parameter, which is of type `\!|packed
array[@t$\langle\\{any}\rangle$@>] of text_char|', stands for the name of
the external file that is being opened for input or output. Blank spaces
that might appear in |name| are ignored.  If a file of the specified name
cannot be found, or if such a file cannot be opened for some other reason
(e.g., it might already be in use), we will have |eof(f)=true| after an
unsuccessful |reset|, and |eof(f)=false| after an unsuccessful |rewrite|.
@:PASCAL H}{\ph@>

Therefore we can implement the file-opening procedures as follows,
where the functions return |false| if no file identified by |cur_name|
could be opened:


@p function a_open_in(var f:alpha_file):boolean;
	{open a text file for input}
begin reset(f,cur_name,'/O'); a_open_in←not eof(f);
end;
@#
function a_open_out(var f:alpha_file):boolean;
	{open a text file for output}
begin rewrite(f,cur_name,'/O'); a_open_out←eof(f);
end;
@#
function b_open_in(var f:byte_file):boolean;
	{open a binary file for input}
begin reset(f,cur_name,'/O'); b_open_in←not eof(f);
end;
@#
function b_open_out(var f:byte_file):boolean;
	{open a binary file for output}
begin rewrite(f,cur_name,'/O'); b_open_out←eof(f);
end;

@ \TEX\ refers to a few files by fixed names that are not supplied by
the user. The following system-dependent definitions specify values of
|cur_name| that correspond to these fixed file names. (See also the
definition of |memory_file| in the |const| section above.)
@↑system dependencies@>

@d default_err_name=='TEXPUT.ERR         ' {transcript file if name unspecified}
@d default_out_name=='TEXPUT.DVI         ' {output file if name unspecified}
@d string_pool_name=='TEX   .POOL        ' {string pool output by \.{WEB}}

@ Files can be closed with the \ph\ routine `|close(f)|', which
@↑system dependencies@>
should be used when all input or output with respect to |f| has been completed.
This makes |f| available to be opened again, if desired; and if |f| was used for
output, the |close| operation makes the corresponding external file appear
on the user's area, ready to be read.

@p procedure a_close(var f:alpha_file); {close a text file}
begin close(f);
end;
@#
procedure b_close(var f:byte_file); {close a binary file}
begin close(f);
end;

@ Binary input and output are done with \PASCAL's ordinary |get| and |put|
procedures, so we don't have to make any other special arrangements for
binary@@I/O. Text output is also easy to do with standard \PASCAL\ routines.
The treatment of text input is more difficult, however, because
of the necessary translation to |ascii_code| values, and because
\TEX's conventions should be efficient and they should
blend nicely with the user's operating environment.

@ Input from text files is read one line at a time, using a routine called
|input_ln|. This function is defined in terms of global variables
called |buffer|, |first|, and |limit|
that will be described in detail later; for now, it suffices for us
to know that |buffer| is an array of |ascii_code| values, and that
|first| and |limit| are indices into this array representing the
beginning and ending of a line of text.

@d limit==cur_input.limit_field {end of current line in |buffer|}

@<Glob...@>=
@!buffer:array[0..buf_size] of ascii_code; {lines of characters being read}
@!first:0..buf_size; {the first unused position in |buffer|}

@ The |input_ln| function brings the next line of input from the specified
file into available positions of the buffer array and returns the value |true|,
unless the file has already been entirely read, in which case it returns
|false|. The |ascii_code| numbers that represent the next line of the file
are input into |buffer[first]|, |buffer[first+1]|, $\ldotss$, |buffer[limit-1]|;
and the global variable |limit| is set equal to |first| plus the length of the
line.

An overflow error is given, however, if the normal actions of |input_ln|
would make |limit≥buf_size|; this is done so that other parts of \TEX\
can safely look at the contents of |buffer[limit+1]| without overstepping
the bounds of the |buffer| array. Upon entry to |input_ln|, the condition
|first<buf_size| will always hold, so there is always room for an ``empty''
line.

This function does a |get| before looking at the first character of the
line, and it does not do a |get| when it reaches the end of the line.
Therefore it can be used to acquire input from the user's terminal as well
as from ordinary text files. Other parts of \TEX\ take care of inputting
the first line of a file, so that the first character is not lost.
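
The bookkeeping for |first| and |limit| can be tried out in isolation. The
following stand-alone program is only a sketch under stated assumptions: a
string stands in for the file, |ord| stands in for |xord|, the function
|store_line| is hypothetical, |buf_size| is made tiny on purpose, and an
overflow is reported by returning |false| instead of stopping the program.

```pascal
program LineBuffer;
{ Sketch only: a string stands in for the file, and ord stands in for xord. }
const
  buf_size = 10;   { deliberately tiny; the constant in the program is 500 }
var
  buffer: array [0..buf_size] of 0..127;
  first, limit: integer;
  ok: boolean;

function store_line(s: string): boolean;
{ puts s into buffer[first..limit-1]; returns false where an overflow
  would be reported }
var k: integer;
begin
  limit := first;
  store_line := true;
  for k := 1 to length(s) do
    if limit + 2 > buf_size then store_line := false
    else begin
      buffer[limit] := ord(s[k]);
      limit := limit + 1
    end
end;

begin
  first := 0;
  ok := store_line('abc');          { fits: limit becomes first+3 }
  writeln(ok, ' ', limit);
  ok := store_line('0123456789');   { too long: limit stops at buf_size-1 }
  writeln(ok, ' ', limit)
end.
```

In the second call |limit| never exceeds |buf_size-1|, which is why
|buffer[limit+1]| remains a legal array reference.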

@p function input_ln(var f:alpha_file):boolean; {inputs the next line
	or returns |false|}
begin get(f); {input the first character of the line into |f↑|}
limit←first;
if eof(f) then input_ln←false
else	begin while not eoln(f) do
		begin if limit+2>buf_size then overflow("buffer size",buf_size);
		buffer[limit]←xord[f↑]; get(f); incr(limit);
		end;
	input_ln←true;
	end;
end;

@ The user's terminal acts essentially like other files of text, except
that it is used both for input and for output. When the terminal is
considered an input file, the file variable is called |term_in|, and when it
is considered an output file the file variable is |term_out|.
@↑system dependencies@>

@<Glob...@>=
@!term_in:alpha_file; {the terminal as an input file}
@!term_out:alpha_file; {the terminal as an output file}

@ Here is how to open the terminal files in \ph:
@↑system dependencies@>

@d t_open_in==reset(term_in,'TTY:','/O') {open the terminal for text input}
@d t_open_out==rewrite(term_out,'TTY:','/O') {open the terminal for text output}

@ Sometimes it is necessary to synchronize the input/output mixture that
happens on the user's terminal, and two procedures are used for this
purpose. The first of these, |update_terminal|, is called when we want
to make sure that everything we have output to the terminal so far has
actually left the computer's internal buffers and been sent.
The other, |clear_terminal|, is called when we wish to cancel any
input that the user may have typed ahead (since we are about to
issue an unexpected error message). The following macros show how these
two operations can be specified in \ph:
@↑system dependencies@>

@d update_terminal == break(term_out) {empty the terminal output buffer}
@d clear_terminal == break_in(term_in,true) {clear the terminal input buffer}

@ We need a special routine to read the first line of \TEX\ input from
the user's terminal. This line is special because it is read before we
have opened the error transcript file; there is sort of a ``chicken and
egg'' problem here. If the user types `\.{\\input paper}' on the first
line, or if some macro invoked by that line does such an \.{\\input},
the transcript file will be named `\.{paper.err}'; but if no \.{\\input}
commands are performed during the first line of terminal input, the transcript
file will acquire its default name `\.{texput.err}'. (The transcript file
will not contain error messages generated by the first line before the
first \.{\\input} command.)

The first line is even more special if we are lucky enough to have an operating
system that treats \TEX\ differently from a run-of-the-mill \PASCAL\ object
program. It's nice to let the user start running a \TEX\ job by typing
a command line like `\.{tex paper}'; in such a case, \TEX\ will operate
as if the first line of input were `\.{paper}', i.e., the first line will
consist of the remainder of the command line, after the part that invoked
\TEX.

@ Different systems have different ways to get started, but regardless of
what conventions are adopted the routine that initializes the terminal
should satisfy the following specifications:

\yskip\textindent{1)}It should open file |term_in| for input from the
	terminal. (The file |term_out| will already be open for output to the
	terminal.)

\textindent{2)}If the user has given a command line, this line should be
	considered the first line of terminal input. Otherwise the
	user should be prompted with `\.*', and the first line of input
	should be whatever is typed in response to this.

\textindent{3)}The first line of input, which might or might not be a
	command line, should appear in locations 0 to |limit-1| of the
	|buffer| array.

\textindent{4)}The global variable |loc| should be set so that the
	character that \TEX\ reads next is in |buffer[loc]|. This
	character should not be blank, and we should have |loc<limit|.

\yskip\noindent(It may be necessary to prompt the user several times
before a nonblank line comes in.)

@d loc==cur_input.loc_field {location of first unread character in |buffer|}

@ The following program in \ph\ does the required initialization
without retrieving a possible command line.
It should be clear how to modify this routine to deal with command lines,
if the system permits them.
@↑system dependencies@>

@p procedure init_terminal; {gets the terminal input started}
label exit;
begin t_open_in;
loop@+begin write(term_out,'*'); update_terminal;
	if not input_ln(term_in) then {this shouldn't happen}
		begin write_ln(term_out);
		write(term_out,'! End of file on the terminal... why?');
@.End of file on the terminal...@>
		quit;
		end;
	loc←0;
	while (loc<limit)∧(buffer[loc]=" ") do incr(loc);
	if loc<limit then return; {return unless the line was all blank}
	write_ln(term_out,'Please type the name of your input file.');
	end;
exit:end;
@ The first 128 strings will contain 95 standard ascii characters, and the
other 33 characters will be printed in three-symbol form like `\.{\↑\↑A}'
unless a system-dependent change is made here. Installations that have
an extended character set, where for example |xchr[@'32]=@t\.{\'\NE\'}@>|,
would like string @'32 to be the single character @'32 instead of the
three characters @'136, @'136, @'132 (\.{\↑\↑Z}). On the other hand,
even people with an extended character set will want to represent string
@'15 by \.{\↑\↑M}, since @'15 is |carriage_return|; the idea is to
produce visible strings instead of tabs or line-feeds or carriage-returns
or bell-rings or characters that are treated anomalously in text files.

The boolean expression defined here should be |true| unless \TEX\ internal code
$k$ corresponds to a non-troublesome visible symbol in the local character
set, given that |k<@'40|.
At MIT, for example, the appropriate formula would be
`|k in [0,@'10..@'12,@'14,@'15,@'33]|'.
@↑character set dependencies@>
@↑system dependencies@>
@* Reporting errors.
When something anomalous is detected, \TEX\ typically does something like this:
$$\vbox{\halign{#\hfil\cr
|print_nl("! Something anomalous");|\cr
|help3("This is the first line of my offer to help.")|\cr
|("This is the second line. I'm trying to")|\cr
|("explain the best way for you to proceed.");|\cr
|error;|\cr}}$$
A two-line help message would be given using |help2|, etc.; these informal
helps should use simple vocabulary that complements the words used in the
official error message that was printed. (Outside of the U.S.A., the help
messages should preferably be translated into the local vernacular. Each
line of help is at most 60 characters long, in the present implementation,
so that |max_print_line| will not be exceeded.)
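
For instance, a two-line message could be set up as follows; the wording
is invented here purely as an illustration and does not occur in \TEX\ itself:
$$\vbox{\halign{#\hfil\cr
|print_nl("! Something else anomalous");|\cr
|help2("This line says what seems to be wrong.")|\cr
|("This line suggests what to do about it.");|\cr
|error;|\cr}}$$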

The |error| procedure supplies a `\..' after the official message, then it
shows the location of the error; and if |nonstop=pausing|, it also enters
into a dialog with the user, during which time the help message may be
printed.
@↑system dependencies@>

@ The |quit| procedure just cuts across all active procedure levels and
jumps out of the program to |end_of_TEX|. This is the only nonlocal
|@!goto| statement in \TEX. It is used when there is no recovery from a
particular error.

Some \PASCAL\ compilers do not implement nonlocal |goto| statements.
@↑system dependencies@>
In such cases the body of |quit| should be a call on |close_files_and_terminate|
followed by a call on some system procedure that terminates the program.

@<Error hand...@>=
procedure quit;
begin goto end_of_TEX;
end;

@ @<Get user's advice...@>=
loop@+begin continue: prompt_input("↑");
	if limit=first then return;
	c←buffer[first];
	if c≥"a" then c←c+"A"-"a"; {convert to upper case}
	@<Interpret code |c| and |return| if done@>;
	end

@ It is desirable to provide an `\.E' option here that gives the user
an easy way to return from \TEX\ to the system editor, with the offending
line ready to be edited. But such an extension requires some system
wizardry, so it is not standard in \TEX\ and not included here except
as a recommendation.
@↑system dependencies@>

There is a secret `\.D' option available if the debugging routines are
included with \TEX.
@↑debugging \TEX@>

@<Interpret code |c| and |return| if done@>=
case c of
"1","2","3","4","5","6","7","8","9": if deletions_allowed then
	@<Delete |c-"0"| tokens@>;
@t\4@>debug "D": begin debug_skipped←debug_cycle; debug_help;@+end;@+gubed@/
"H": if help_ptr>0 then @<Print the help information, |goto continue|@>
	else help2("Sorry, I don't know how to help in this situation.")@/
	("Maybe you should try asking a human?");
"I":@<Introduce new material from the terminal and |return|@>;
"Q","R","S":@<Enter a nonstop mode and |return|@>;
"X":begin prompt_input("Type X again to exit:");
	if (limit>first)∧((buffer[first]="x")∨(buffer[first]="X")) then quit;
	end;
othercases do_nothing
endcases;@/
@<Print the menu of available options@>

@ The commented-out part of the following menu
should be taken out of braces if the |"E"| option is implemented.
@↑system dependencies@>

@<Print the menu...@>=
print("Type <return> to continue, ");
print("S to scroll future error messages,");
print_nl("R to run without stopping, ");
print("Q to run quietly,");
print_nl("I to insert something, ");
@{@,@,if base_ptr>0 then print("E to edit your file,");@+@}@/
if deletions_allowed then
	print_nl("1 or ... or 9 to ignore the next 1 to 9 tokens of input,");
print_nl("H for help, X to quit.");

@ Users occasionally want to interrupt \TEX\ while it is running.
If the \PASCAL\ runtime system allows this, it would be nice to implement
a routine that sets the global variable |interrupt| to some nonzero value
when such an interrupt is signalled. Otherwise there is probably at least
a way to make |interrupt| nonzero using the \PASCAL\ debugger.
@↑system dependencies@>
@↑debugging@>

@d check_interrupt==begin if interrupt≠0 then pause_for_instructions;
		end

@<Global...@>=
@!interrupt:integer; {should \TEX\ pause for instruction?}

@ When \TEX\ ``packages'' a list into a box, it needs to calculate the
proportionality ratio by which the glue inside the box should stretch
or shrink. This calculation does not affect \TEX's decision making,
so the precise details of rounding, etc., in the glue calculation are not
of critical importance for the consistency of \TEX\ results on different
computers.

We shall use the type |glue_ratio| for such proportionality ratios.
An unsigned glue ratio should take the same amount of memory as an
|integer| (usually 32 bits) if it is to blend smoothly with \TEX's
other data structures. Thus |glue_ratio| should be equivalent to
|short_real| in some implementations of \PASCAL. Alternatively,
it is possible to deal with glue ratios in a fixed-point manner;
see {\sl TUGboat \bf3} (1982), 10--27.
@↑system dependencies@>

@<Types...@>=
@!glue_ratio=real; {one-word representation of a glue expansion factor}
@* Packed data.
In order to make efficient use of storage space, \TEX\ bases its major data
structures on a |memory_word|, which contains either a (signed) integer,
possibly scaled, or an (unsigned) |glue_ratio|, or a small number of
fields that are one half or one quarter of the size used for storing
integers.

If |x| is a variable of type |memory_word|, it contains up to four
fields that can be referred to as follows:
$$\vbox{\halign{\hfil#\hfil\cr
|x.int|\qquad\quad(an |integer|)\cr
|x.sc|\qquad(a |scaled| integer)\cr
|x.gr|\qquad(a |glue_ratio|)\cr
|x.hh.lh|, |x.hh.rh|\qquad(two halfword fields)\cr
|x.hh.b0|, |x.hh.b1|, |x.hh.rh|\qquad(two quarterword fields, one halfword
	field)\cr
|x.qqqq.b0|, |x.qqqq.b1|, |x.qqqq.b2|, |x.qqqq.b3|\qquad(four quarterword
	fields)\cr}}$$
This is somewhat cumbersome to write, and not very readable either, but
macros will be used to make the notation shorter and more transparent.
The \PASCAL\ code below gives a formal definition of |memory_word| and
its subsidiary types, using packed variant records. \TEX\ makes no
assumptions about the relative positions of the fields within a word.

Since we are assuming 32-bit integers, a halfword must contain at least
16 bits and a quarterword must contain at least 8 bits.
@↑system dependencies@>
But it doesn't hurt to have more bits; for example, with enough 36-bit
words you might be able to have |mem_max| as large as 262142, which is
eight times as much memory as anybody had during the first four years of
\TEX's existence.

N.B.: Valuable memory space will be dreadfully wasted unless \TEX\ is compiled
by a \PASCAL\ that packs all of the |memory_word| variants into
the space of a single integer. This means, for example, that |glue_ratio|
words should be |short_real| instead of |real| on some computers. Some
\PASCAL\ compilers will pack an integer whose subrange is `|0..255|' into
an eight-bit field, but others insist on allocating space for an additional
sign bit; on such systems you can get 256 values into a quarterword only
if the subrange is `|-128..127|'.

The present implementation tries to accommodate as many variations as possible,
so it makes rather general assumptions. If integers having the subrange
`|min_quarterword..max_quarterword|' can be packed into a quarterword,
and if integers having the subrange `|min_halfword..max_halfword|'
can be packed into a halfword, everything should work satisfactorily.
These quantities must satisfy the following restrictions (only):
$$\baselineskip 15pt
\vbox{\halign{$\hfil#\null$&$\null\L#\hfil$\quad
	     &$\hfil#\null$&$\null\R#\hfil$\cr
|min_quarterword|&0,&|max_quarterword|&127;\cr
|min_halfword|&0,&|max_halfword|&32767;\cr
|min_halfword|&|min_quarterword|,&|max_halfword|&|max_quarterword|;\cr
|min_halfword|&|mem_base|,&|max_halfword|&|mem_max|+1;\cr
|min_quarterword|&|font_base|,&|max_quarterword|&|font_max|.\cr}}$$

It is usually most efficient to have |min_quarterword=min_halfword=0|,
so one should try to achieve this unless it causes a severe problem.
The values defined here are recommended for most 32-bit computers.
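
For a compiler of the sign-bit kind mentioned above, one plausible choice
(an untested assumption, offered only as an example of values that meet the
stated restrictions) would be |min_quarterword=-128|, |max_quarterword=127|,
|min_halfword=-32768|, and |max_halfword=32767|. Such a choice is permissible
only if |mem_base| and |font_base| are not smaller than the respective minima,
and if |mem_max<32767| and |font_max<128|.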

@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==65535 {largest allowable value in a |halfword|}

@ The reader should study the following definitions closely:
@↑system dependencies@>

@d sc==int {|scaled| data is equivalent to |integer|}

@<Types...@>=
@!quarterword = min_quarterword..max_quarterword; {1/4 of a word}
@!halfword=min_halfword..max_halfword; {1/2 of a word}
@!two_choices = 1..2; {used when there are two variants in a record}
@!four_choices = 1..4; {used when there are four variants in a record}
@!two_halves = packed record@/
	@!rh:halfword;
	case two_choices of
	1: (@!lh:halfword);
	2: (@!b0:quarterword; @!b1:quarterword);
	end;
@!four_quarters = packed record@/
	@!b0:quarterword;
	@!b1:quarterword;
	@!b2:quarterword;
	@!b3:quarterword;
	end;
@!memory_word = packed record@/
	case four_choices of
	1: (@!int:integer);
	2: (@!gr:glue_ratio);
	3: (@!hh:two_halves);
	4: (@!qqqq:four_quarters);
	end;
@ Warning: If any changes are made to these data structure layouts, such as
changing any of the node sizes or even reordering the words of nodes,
the |copy_node_list| procedure and the memory initialization code
below may have to be changed. However, other references to the nodes are made
symbolically in terms of the \.{WEB} macro definitions above, so format changes
will leave \TEX's other algorithms intact.
@↑system dependencies@>
@ The following procedure, which is called just before \TEX\ initializes its
input and output, establishes the initial values of the date and time.
@↑system dependencies@>
Since standard \PASCAL\ cannot provide such information, some special code
is needed. The program here simply specifies July 4, 1776, at noon; but
users probably want a better approximation to the truth.

@p procedure fix_date_and_time;
begin time←12*60; {minutes since midnight}
day←4; {fourth day of the month}
month←7; {seventh month of the year}
year←1776; {Anno Domini}
end;
@ Additional information about the current line is available via the
|index| variable, which counts how many lines of characters are present
in the buffer below the current level. We have |index=0| when reading
from the terminal and prompting the user for each line; if the user types,
e.g., `\.{\\input paper}', we will have |index=1| while reading
the file \.{paper.tex}. However, it does not follow that |index| is the
same as the input stack pointer, since many of the levels on the input
stack come from token lists. For example, the instruction `\.{\\input paper}'
might occur in a token list.

The variable |in_open| is equal to the |index|
value of the highest non-token-list level. Thus, the number of partially read
lines in the buffer is |in_open+1|, and we have |in_open=index|
when we are not reading a token list.

If we are currently reading from the terminal, the value of
|term[index]| will be |true|; otherwise, of course, it is |false|,
and we are reading from |input_file[index]|. We use the notation
|terminal_input| as a convenient abbreviation for |term[index]|,
and |cur_file| as an abbreviation for |input_file[index]|.

If more information about the input state is needed, it can be
included in small arrays like those shown here. For example,
the current page or segment number in the input file might be
put into `|page:array[1..max_in_open] of halfword|'.
@↑system dependencies@>

@d terminal_input==term[index] {are we reading from the terminal?}
@d cur_file==input_file[index] {the current |alpha_file| variable}

@<Globals...@>=
@!in_open : 0..max_in_open; {the number of lines in the buffer, less one}
@!input_file : array[1..max_in_open] of alpha_file;
@!term : array[0..max_in_open] of boolean;
@!name_string:array[0..max_in_open] of str_number; {file names}

@ This routine should be changed, if necessary, to give the best possible
indication of where the current line resides in the input file.
For example, on some systems it is best to print both a page and line number.
@↑system dependencies@>

@<Print location of current line@>=
if terminal_input then
	if base_ptr=0 then print("<*>") else print("<**>")
else	begin print("l."); print_int(line);
	end;
print_char(" ");

@* File names.
Besides the fact that different operating systems treat files in different ways,
we must cope with the fact that completely different naming conventions
are used. The following programs show what is required for the {\mc WAITS}
operating system, on which \TEX\ was developed; similar routines for other
operating systems are not difficult to devise.

In order to isolate the system-dependent aspects of file names, the
@↑system dependencies@>
system-independent parts of \TEX\ make use of three system-dependent
procedures that are called |begin_name|, |more_name|, and |end_name|. In
essence, if the user-specified characters of the file name are $c↓1\ldotsm c↓n$,
the system-independent driver program does the operations
$$|begin_name|;\,|more_name|(c↓1);\ldotss;|more_name|(c↓n);
\,|end_name|.$$
These three procedures communicate with each other via global variables.
Afterwards the variable |cur_name|, which is of type |name_type|,
will contain the name in the form needed by the file-opening procedures.

Actually the situation is slightly more complicated, because \TEX\ needs
to know when the file name ends. The |more_name| routine is a function
(with side effects) that returns |true| on the calls |more_name|$(c↓1)$,
$\ldotss$, |more_name|$(c↓{n-1})$. The final call |more_name|$(c↓n)$
returns |false|; or, it returns |true| and the token following $c↓n$ is
something like `\.{\\hbox}' (i.e., not a character). In other words,
|more_name| is supposed to return |true| unless it is sure that the
file name has been completely scanned; and |end_name| is supposed to be able
to finish the assembly of |cur_name| regardless of whether |more_name|$(c↓n)$
returned |true| or |false|.

@ \TEX\ assumes that a file name has three parts: the name proper, its
`extension', and a `file area' where it is found in an external file system.
The extension of an input file or a send file is assumed to be `\.{tex}'
unless otherwise specified; it is `\.{err}' on the error transcript file
that records each run of \TEX; it is `\.{tfm}' on the font metric files
that describe characters in the fonts \TEX\ uses; it is `\.{dvi}' on
the output files that specify typesetting information; and it is `\.{mem}'
on the memory files written by \.{INITEX} to initialize \TEX. The file area can
be arbitrary on input files, but it must be the user's current area when a
file is output.  If an input file cannot be found on the specified area,
\TEX\ will look for it on a special system area; this special area is
intended for commonly used input files like \.{basic.tex}.

To implement these features, the |begin_name| procedure has an argument,
which is the default extension. There are two other procedures, called
|change_ext| and |change_area|, which override the extension and
file area that appear in the current file name being processed.
@↑system dependencies@>

@ File names at {\mc WAITS} have the form `$\alpha$' or `$\alpha\..\beta$'
possibly followed by a file-area specification having the form `$\.[\gamma\.,
\delta\.]$'. The extension part is, of course, $\beta$.

The strings $\alpha$, $\beta$, $\gamma$, and $\delta$ are limited in length
to 6, 3, 3, and 3 characters, respectively. The following procedures accommodate
local customs by using the first and last three characters of $\alpha$ if it
has seven or more, by using the first three characters of $\beta$ if it has four
or more, and by using the first nine characters of the area specification
$\.[\gamma\.,\delta\.]$ if more than nine are present. Long names $\alpha$
are compressed by using the first and last parts of the name, since this
gives unique codes distinguishing all of the necessary font names. For example,
`\.{helvetica}' and `\.{helveticab}' (bold helvetica) become `\.{helica}'
and `\.{helcab}', and `\.{oldenglish}' becomes `\.{oldish}'.

The user's stated file name is parsed into a string of length 19, where
positions |1..6| are for $\alpha$, |7..10| are for $\..\beta$, and
|11..19| are for $\.[\gamma\.,\delta\.]$. This string |unpacked_name| is
created by |begin_name| and |more_name|, then packed into |cur_name| by
|end_name|.

The standard file area for \TEX\ input is \.{[tex,sys]}.
@↑system dependencies@>

@d user_area == '         ' {specifies the user's personal files}
@d tex_file_area == '[tex,sys]' {look here if not found on user area}
@#
@d err_ext=='.err' {extension for transcript file}
@d tex_ext=='.tex' {default extension for input files}
@d tfm_ext=='.tfm' {extension for font metric files}
@d dvi_ext=='.dvi' {extension for tex output files}
@d mem_ext=='.mem' {extension for memory output files}
@d blank_ext=='    ' {used to suppress an extension}

@<Types...@>=
@!name_type = packed array[1..19] of text_char;
	{file name for |reset| or |rewrite|}
@!ext = packed array[1..4] of text_char; {for specifying extensions}
@!area = packed array[1..9] of text_char; {for specifying file areas}

@ Here are the global variables used for this particular implementation of
file name scanning:
@↑system dependencies@>

@<Globals...@>=
@!cur_name : name_type; {the result of file name scanning}
@!unpacked_name : array [1..19] of text_char; {unpacked version of |cur_name|}
@!name_ptr : 0..19; {current position in |unpacked_name|}
@!name_limit : 0..19; {|name_ptr| will not advance past this}

@ According to these conventions, it is easy to change the current
extension and the current area.
@↑system dependencies@>

@p procedure change_ext(@!e:ext); {overrides the current extension}
begin unpack(e,unpacked_name,7); {this goes into positions |7..10|}
end;
@#
procedure change_area(@!a:area); {overrides the current area}
begin unpack(a,unpacked_name,11); {this goes into positions |11..19|}
end;

@ Here now is the first of the file name scanning procedures for {\mc WAITS}:
@↑system dependencies@>

@p procedure begin_name(@!e:ext); {start scanning a file name, with
	default extension |e|}
var k:1..19; {index into |unpacked_name|}
begin for k←1 to 19 do unpacked_name[k]←' '; {store all blanks}
change_ext(e); {store the default extension}
name_ptr←0; name_limit←6; {get ready to scan the name proper}
end;

@ The |more_name| subroutine accepts all syntactically correct {\mc WAITS}
file names, and doesn't bother to detect malformed ones, because such syntactic
errors will simply cause the file-opening routine to return |false|.

In general, none of the system-dependent file-name-scanning routines should
issue error messages, since \TEX\ uses them before it knows where to
output such messages.
@↑system dependencies@>

@p function more_name(@!c:ascii_code):boolean;
begin if c=" " then more_name←false
else	begin if c="." then {the user is specifying an explicit extension}
		begin change_ext('.   '); name_ptr←7; name_limit←10;
		end
	else if c="[" then {the user is specifying an explicit file area}
		begin unpacked_name[11]←chr(c); name_ptr←11; name_limit←19;
		end
	else if name_ptr=name_limit then
		begin if name_limit=6 then {name greater than six letters}
			begin unpacked_name[4]←unpacked_name[5];
			unpacked_name[5]←unpacked_name[6];
			unpacked_name[6]←chr(c);
			end;
		end  {otherwise we simply drop character |c|}
	else	begin incr(name_ptr); unpacked_name[name_ptr]←chr(c);
		end;
	more_name←true;
	end;
end;

@ The |end_name| routine simply packs the unpacked file name.
@↑system dependencies@>

@p procedure end_name;
begin pack(unpacked_name,1,cur_name); {set |cur_name| in required form}
end;

@ We also need another system-dependent procedure, whose duty is to
convert the |cur_name| array into a string in the string pool, and to return
the number of that string.
@↑system dependencies@>

@p function make_name_string:str_number; {converts |cur_name| to a string}
var k:1..19; {index into |cur_name|}
begin str_room(19);
for k←1 to 19 do if cur_name[k]≠' ' then append_char(xord[cur_name[k]]);
make_name_string←make_string;
end;
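
@ As a check on the procedures above, here is a trace worked out by hand
(not taken from an actual run) for the file name `\.{helveticab.tfm}',
starting just after |begin_name(tex_ext)| has been called. The left column
shows the characters that have been passed to |more_name| so far; the right
column shows the nonblank contents of positions |1..10| of |unpacked_name|.
Positions |11..19| stay blank throughout, since no file area is given.
$$\vbox{\halign{#\hfil\qquad\cr
\.{helvet}\cr
\.{helveti}\cr
\.{helvetic}\cr
\.{helvetica}\cr
\.{helveticab}\cr
\.{helveticab.}\cr
\.{helveticab.tfm}\cr}\qquad\halign{#\hfil\cr
\.{helvet.tex}\cr
\.{heleti.tex}\cr
\.{heltic.tex}\cr
\.{helica.tex}\cr
\.{helcab.tex}\cr
\.{helcab.}\cr
\.{helcab.tfm}\cr}}$$
Each letter after the sixth shifts positions 5 and 6 down by one and enters
position 6, so the first three letters survive together with the last three.
The period replaces the default extension by `\..' and three blanks, after
which the letters of the extension fill positions |8..10|. After |end_name|
has packed this result into |cur_name|, the function |make_name_string|
would produce the ten-character string `\.{helcab.tfm}'.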

@ Note: A malformed \.{TFM} file might be shorter than it claims to be;
thus |eof(tfm_file)| might be true when |read_font_info| refers to
|tfm_file↑| or when it says |get(tfm_file)|. If such circumstances
cause system error messages, you will have to defeat them somehow,
for example by defining |fget| to be `\!|begin get(tfm_file);|
|if eof(tfm_file) then abort; end|\unskip'.
@↑system dependencies@>

@d fget==get(tfm_file)
@d fbyte==tfm_file↑
@d read_sixteen(#)==begin #←fbyte;
	if #>127 then abort;
	fget; #←#*@'400+fbyte;
	end
@d store_four_quarters(#)==begin fget; a←fbyte; qw.b0←a+min_quarterword;
	fget; b←fbyte; qw.b1←b+min_quarterword;
	fget; c←fbyte; qw.b2←c+min_quarterword;
	fget; d←fbyte; qw.b3←d+min_quarterword;
	#←qw;
	end
@ The last part of the postamble, following the phony font number
$2↑{32}-1$, contains |q|, a pointer to the |pst| command that started the
postamble.  An identification byte, |i|, comes next; currently this byte
is always set to@@2. (Some day we will set |i=3|, when \.{DVI} format
makes another incompatible change---perhaps in 1990.)

Following the |i| byte there are four or more bytes that are all equal to
the decimal number 223 (i.e., @'337 in octal). \TEX\ puts out four to seven of
these trailing bytes, until the total length of the file is a multiple of
four bytes, since this works out best on machines that pack four bytes per
word; but any number of 223's is allowed, as long as there are at least four
of them. In effect, 223 is a sort of signature that is added at the very end.

This curious way to end a \.{DVI} file makes it feasible for \.{DVI}-reading
programs to find the postamble first, on most computers, even though \TEX\
wants to write the postamble last. Most operating systems permit random
access to individual words or bytes of a file, so the \.{DVI} reader
starts at the end and skips backwards over the 223's until finding the
identification byte. Then it backs up four bytes, reads |q|, and goes to
byte |q| of the file. This byte should, of course, contain the value 240
(|pst|); now the postamble can be read, so the \.{DVI} reader discovers
all the information needed for typesetting the pages. Note that it is also
possible to skip through the \.{DVI} file at reasonably high speed to
locate a particular page, if that proves desirable.

The reason for reading the postamble first is that the \.{DVI} reader must
know the widths of characters, in order to find out where things go on a page;
and it needs to know the names of the fonts, so that it can get their widths
from a \.{TFM} file or from some other kind of font-information file.
The reason for writing the postamble last is that \TEX\ can't put out all
the font names until it has finished generating the pages of the \.{DVI}
file, since new fonts can occur anywhere in a \TEX\ job; and the alternative
of sprinkling font definitions throughout a \.{DVI} file is unattractive,
since that would make it necessary to read the whole file even when
printing only one page. The alternative of copying the information in the
first part of a \.{DVI} file to the end of another file that begins with
the postamble information is also unattractive, since the first part of a
\.{DVI} file is typically quite long.

Unfortunately, however, standard \PASCAL\ does not include the ability to
@↑system dependencies@>
access a random position in a file, or even to determine the length of a file.
Almost all systems nowadays provide the necessary capabilities, so \.{DVI}
format has been designed to work most efficiently with modern operating systems.
But if \.{DVI} files have to be processed under the restrictions of standard
\PASCAL, one can simply read them twice, first skipping to the postamble
and then doing the pages. Another solution that has been used in some cases
is to doctor \TEX\ so that it outputs a postamble file in addition to a
regular \.{DVI} file; the postamble should appear in both files, so that
the \.{DVI} file can be transmitted to other computers.
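
@ The number of trailing bytes can be stated as a formula. (This is merely
one way to express the rule given above; it is not a quotation from \TEX's
code.) Suppose the \.{DVI} file contains |n| bytes up to and including the
identification byte, and suppose |k| bytes equal to 223 are then appended.
If \TEX\ puts out four to seven such bytes, stopping when the total length
is a multiple of four, we have
$$|k=7-((n+3) mod 4)|,$$
so that |k| lies between 4 and 7 and |n+k| is a multiple of 4.
For example, |n=1000| gives |k=4| and a file of 1004 bytes, while |n=1001|
gives |k=7| and a file of 1008 bytes.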
@* The main program.
@.INITEX@>
This is just a sketch, to be filled in later...

@p procedure close_files_and_terminate;
begin end;

@ Little by little this will take shape, I hope.

@p begin start_here: initialize; {set miscellaneous variables to initial values}
init no_new_control_sequence←false;
@<Put each of \TEX's primitives into the hash table@>;
no_new_control_sequence←true; goto start_of_TEX;@+tini@;@/
@<Initialize tables from a file written by \.{INITEX}@>;
start_of_TEX: fix_date_and_time;
@<Compute the magic offset@>;
@<Initialize the output routines@>;
@{@t that should include something like this:@>@/
	out_name←0;
	t_open_out; selector←term_only; tally←0; offset←0;
	write(term_out,banner);
	print(memory_ident); {e.g. |" (basic 82.5.17) "|}
	print_ln; @}@/
@<Get the first line of input and set the |nonstop| mode@>;
main_control; {transfer control to the chief executive}
end_of_TEX: close_files_and_terminate;
final_end: end.

@ When we perform the following code, almost all of \TEX's global
variables have been set up. Therefore nothing anomalous will happen
if we are forced to abort the job and go to |end_of_TEX| in strange
circumstances.

(I will move the following comment to |main_control| when I write it...)
At the end of this routine, \TEX\ is ready to call |get_next| to
get the first token of input. However, the token `\.{\\input}'
will be inserted automatically if |loc<limit| and
|ch_code(buffer[loc])≠escape|.
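
For example, according to the code below, a first line `\.{paper}' leaves
|nonstop=pausing|; a first line `\.{*paper}' sets |nonstop←never_pausing|
and advances |loc| past the asterisk; and a first line `\.{**paper}' sets
|nonstop←silent_running| and |selector←no_print|, with |loc| advanced past
both asterisks. Any blanks that follow the asterisks are skipped as well.
(The file name \.{paper} is used here only as an illustration.)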

@<Get the first line...@>=
begin @<Initialize the input routines@>;
if buffer[loc]="*" then
	begin incr(loc); if (loc<limit)∧(buffer[loc]="*") then
		begin incr(loc); nonstop←silent_running;
		selector←no_print;
		end
	else nonstop←never_pausing;
	while (loc<limit)∧(buffer[loc]=" ") do incr(loc);
	end
else	nonstop←pausing;
buffer[limit]←carriage_return;
end

@* Index.
Here is where you can find all uses of each identifier in the program,
with underlined entries pointing to where the identifier was defined.
If the identifier is only one letter long, however, you get to see only
the underlined entries. All references are to section numbers instead of
page numbers.

This index also lists error messages and other aspects of the program
that you might want to look up some day. For example, the entry
for ``system dependencies'' lists all sections that should receive
special attention from people who are installing \TEX\ in a new
operating environment.